A Compact FP-Tree for Fast Frequent Pattern Retrieval

نویسنده

  • Tri-Thanh Nguyen
چکیده

Frequent patterns are useful in many data mining problems including query suggestion. Frequent patterns can be mined through frequent pattern tree (FPtree) data structure which is used to store the compact (or compressed) representation of a transaction database (Han, et al, 2000). In this paper, we propose an algorithm to compress frequent pattern set into a smaller one, and store the set in a modified version of FP-tree (called compact FP-tree) as an inverted indexing of patterns for later quick retrieval (for query suggestion). With the compact FP-tree, we can also restore the original frequent pattern set. Our experiment results show that our compact FP-tree has a very good compression ratio, especially on sparse dataset which is the nature of query log.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Intelligent Approach to Information Retrieval System Using Enhanced DIG and FP-Tree Techniques

Information retrieval is the process of retrieving all the relevant documents that satisfies the user query from large corpora. It is aimed to provide the relevant information and documents that matches the user query. Outcome of the several research results confirms that difficulties in information retrieval are matching the query with corpus. Consequently, the enhanced indexing technique name...

متن کامل

A Scalable Algorithm for Constructing Frequent Pattern Tree

Frequent Pattern Tree (FP-Tree) is a compact data structure of representing frequent itemsets. The construction of FP-Tree is very important prior to frequent patterns mining. However, there have been too limited efforts specifically focused on constructing FP-Tree data structure beyond from its original database. In typical FPTree construction, besides the prior knowledge on support threshold,...

متن کامل

Efficient single-pass frequent pattern mining using a prefix-tree

The FP-growth algorithm using the FP-tree has been widely studied for frequent pattern mining because it can dramatically improve performance compared to the candidate generation-and-test paradigm of Apriori. However, it still requires two database scans, which are not consistent with efficient data stream processing. In this paper, we present a novel tree structure, called CP-tree (compact pat...

متن کامل

Incrementally fast updated frequent pattern trees q

The frequent-pattern-tree (FP-tree) is an efficient data structure for association-rule mining without generation of candidate itemsets. It was used to compress a database into a tree structure which stored only large items. It, however, needed to process all transactions in a batch way. In real-world applications, new transactions are usually inserted into databases. In this paper, we thus att...

متن کامل

Improved algorithm for mining maximum frequent patterns based on FP-Tree

Mining association rule is an important matter in data mining, in which mining maximum frequent patterns is a key problem. Many of the previous algorithms mine maximum frequent patterns by producing candidate patterns firstly, then pruning. But the cost of producing candidate patterns is very high, especially when there exists long patterns. In this paper, the structure of a FP-tree is improved...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2013